Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals

Goel, Surbhi, Karmalkar, Sushrut, Klivans, Adam

Neural Information Processing Systems

We consider the problem of computing the best-fitting ReLU with respect to square-loss on a training set when the examples have been drawn according to a spherical Gaussian distribution (the labels can be arbitrary). Let $\opt < 1$ be the population loss of the best-fitting ReLU. We prove:
\begin{itemize}
\item Finding a ReLU with square-loss $\opt + \epsilon$ is as hard as the problem of learning sparse parities with noise, widely thought to be computationally intractable. This is the first hardness result for learning a ReLU with respect to Gaussian marginals, and our results imply, {\em unconditionally}, that gradient descent cannot converge to the global minimum in polynomial time.
\item There is an algorithm that agnostically learns a ReLU to error $\opt^{2/3} + \epsilon$ in time $\mathrm{poly}(d, 1/\epsilon)$. The algorithm uses a novel reduction to noisy halfspace learning with respect to $0/1$ loss.
\end{itemize}
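To make the setting concrete, here is a minimal sketch in Python (illustrative only; the data, step size, and all names are my assumptions, and this is not the paper's algorithm): gradient descent on the empirical square loss of $f_w(x) = \max(0, w \cdot x)$ with spherical Gaussian examples and arbitrary labels. The hardness result above says that, in the worst case over labels, no such local-search loop can reach square-loss $\opt + \epsilon$ in polynomial time.

import numpy as np

rng = np.random.default_rng(0)
d, n = 20, 5000

# Spherical Gaussian examples; labels may be arbitrary (agnostic setting).
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)  # placeholder labels, purely for illustration

def relu(z):
    return np.maximum(0.0, z)

def square_loss(w):
    # Empirical square loss of f_w(x) = max(0, w . x).
    return np.mean((y - relu(X @ w)) ** 2)

def grad(w):
    # (Sub)gradient of the empirical square loss in w.
    z = X @ w
    active = (z > 0).astype(float)
    return -2.0 * np.mean(((y - relu(z)) * active)[:, None] * X, axis=0)

w = rng.standard_normal(d) / np.sqrt(d)
for _ in range(1000):
    w -= 0.1 * grad(w)

print(square_loss(w))  # a stationary point; not guaranteed to be near opt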


Reviews: Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals

Neural Information Processing Systems

This paper studies the computational complexity of learning a single ReLU with respect to Gaussian examples. Since ReLUs are now the standard choice of nonlinearity in deep neural networks, the computational complexity of learning them is clearly of interest. Of course, the computational complexity of learning a ReLU may depend substantially on the specific setting assumed; it is interesting to understand the range of such assumptions and their implications for complexity. This paper studies the following setting: given independent samples $(x_1, y_1), \ldots, (x_n, y_n)$ where $x$ is spherical Gaussian in $d$ dimensions and $y \in \mathbb{R}$ is arbitrary, find a ReLU function $f_w(x) = \max(0, w \cdot x)$ for some vector $w$ with minimal mean squared error $\sum_i (y_i - f_w(x_i))^2$. (This is agnostic learning, since the $y$'s are arbitrary.) The main results are as follows:
1) There is no algorithm that learns a single ReLU with respect to Gaussian examples to additive error $\epsilon$ in time $d^{o(\log 1/\epsilon)}$ unless $k$-sparse parities with noise can be learned in time $d^{o(k)}$.
2) If $\opt = \min_w$ (mean squared error of $f_w$), then (with normalization such that $\opt \in [0,1]$) there is an algorithm that agnostically learns a ReLU to error $\opt^{2/3} + \epsilon$ in time $\mathrm{poly}(d, 1/\epsilon)$.
The proof of (1) goes via Hermite analysis (i.e.
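For readers unfamiliar with the hardness assumption in (1), the following sketch generates a $k$-sparse-parity-with-noise instance (this is the standard definition of the problem, not code from the paper; all names are mine). The conjecture referenced above is that no algorithm can recover the hidden set $S$ from such samples in time $d^{o(k)}$.

import numpy as np

rng = np.random.default_rng(1)
d, k, n, eta = 30, 3, 10000, 0.1

# Hidden size-k subset of coordinates defining the parity.
S = rng.choice(d, size=k, replace=False)

# Uniform examples over {-1, +1}^d.
X = rng.choice([-1, 1], size=(n, d))

# Label = product of the coordinates in S, flipped with probability eta.
clean = np.prod(X[:, S], axis=1)
flips = rng.random(n) < eta
y = clean * np.where(flips, -1, 1)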

